(57) Abstract: A method for convolution of a digital input signal, wherein digital signals corresponding to an impulse response for at 
least an instance of a spatial environment are stored in a memory (16). Samples of the digital signal are continuously stored in an input 
buffer (14) and a set of samples, limited in time, of the digital input signal is delimited from continuously incoming samples. The 
digital signals of the impulse response are divided into a plurality of segments consisting of coefficients and convolution operations 
are repeatedly performed with a segment of the impulse response and the set of samples, limited in time, of the digital input signal 
until all segments of the impulse response have been processed. The convolution operations are performed by parallel multiplication 
of the coefficients of the impulse response and the set of samples, limited in time, of the digital input signal in a multiplier unit (12) 
and by repeated addition of the products in pairs forming partial results. Partial results are added to previously calculated associated 
partial results in an adder unit (13) forming an output signal sample. 
METHOD AND DEVICE IN A CONVOLUTION PROCESS

TECHNICAL FIELD OF THE INVENTION 

The present invention relates to a method in convolution. The method
may be used in so-called auralising in a space. Auralising implies that a
calculation is made of how sounds are perceived by both ears of the
listener, so as to recreate the natural experience of space. Stereo sound
is part of the sound experience. Instead of two separate microphones, small
microphones may be put in the ears of a person present in the recording
studio. A person listening to the recorded sound in headphones will experience
the same sound image as at the time of recording. This is called artificial
head stereo and is a foundation for auralising, where the headphone sound
is created artificially. The invention also relates to a device for performing the
method.

The invention particularly relates to real time auralisation in large
premises, for example concert halls. An impulse response for each ear,
for a specific placing and orientation of sound source and listener, is
calculated in advance. It is also possible to approximately measure the impulse
response by firing a pistol where the sound source is going to be and recording
the sound with microphones in the auditory canals of a so-called artificial
head at the location of the listener. The time until the impulse response has
died away is called the reverberation time.

As long as both the sound source and the listener are still, the impulse
responses are constant. Starting from these and a non-reverberant
recording of the sound source (e.g. a musical instrument on which a certain
piece of music is played), the sound for each ear is calculated by an
operation called convolution. It filters the original sound according to
both the reflections (reverberation) in the premises and the direction-sensitive
influence of the ear on the sound before it reaches the eardrum.

The impulse responses as well as the non-reverberant original sound 
may be represented as series of numbers and all calculations may take 



WO 01/90927 PCT/SE01/01074

place digitally, for example in a computer. However, performing the
calculation in real time requires great computing power. With a sampling rate
of 50 kHz (full CD quality) and a room reverberation time of 2 seconds,
2 x 10^10 (20 billion) multiplications and as many additions per second are
required.

STATE OF THE ART 

The mathematical definition of convolution for the current application is 
evident from Formula 1 below. 

$$y(t) = \int_{0}^{\infty} h(v)\,x(t-v)\,dv \qquad (1)$$

wherein 

x constitutes the input, 

h constitutes the convolution kernel (filter) and
y constitutes the output. 

Modern signal processing is based on Fourier transforming
signals in the time plane into transformed signals in the frequency plane. The
signal processing then takes place in the frequency plane. The convolution
to be performed in the time plane corresponds to multiplication in the
frequency plane. As multiplication is a simpler operation, signal
processing has previously been implemented by multiplication in fast hardware
in this context as well.

It is also necessary to transform from the time plane to the frequency
plane and back again. For these operations there are algorithms particularly
adapted to computers, the so-called Fast Fourier Transform (FFT) and its
discrete equivalents, with inverses. Commercially available processors
adapted to these calculations, so-called digital signal processors (DSP),
have been available for a long time.

One problem with solutions in the frequency plane is that the calculation
time for the FFT grows faster than linearly with the length of the filter (N*logN).
For convolutions of 100,000 points this method is unfavourable regarding the
amount of hardware compared to performing the calculations in the time
plane. A contributing cause is that the calculation of the FFT requires calculation
with floating point numbers, while all calculations in the time plane can be
performed with integers. A difficulty with convolution in the time plane is that
there is no known method for continuously performing such calculations in real
time. Neither is there any known device by which the calculations could be
performed in real time.



SUMMARY OF THE INVENTION 

One object of the invention is to provide a method for convolution of
digital signals. This object is achieved by the invention having the features of
claims 1 and 6, respectively. The invention solves the problem of how to
perform long convolutions of sound in real time using a reasonable amount of
hardware.

According to the invention, the fundamental operations in convolution
are performed in parallel in an efficient manner. Sound samples from an input
signal and terms, or filter coefficients, from the impulse response are stored
in registers. Each sound sample is multiplied by its filter coefficient,
separately and in parallel. Thereafter the products are added.

The additions of the products may take place in a so-called adder tree,
wherein the included terms are first added in pairs. The sums are again added in
pairs in a repeated sequence until a final sum is calculated. Because addition
is commutative and associative (the order is immaterial), this procedure gives
exactly the same result as if the original numbers had been added in turn.

A key question for providing an efficient calculation is how to process
the data included in the impulse response. An impulse response may
comprise in the order of 100,000 samples. To avoid the need for a massive
amount of hardware, the impulse response is, according to the invention,
divided into segments. The required hardware is thereby decreased dramatically
because it may be reused by time multiplexing.

Further, each segment of the impulse response is used efficiently
because the convolution operations are performed with the segment together
with a plurality of samples from the input signal. The calculated results are
added in associated positions in an output buffer, from which an output is
fed.
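
The segmenting principle summarised above can be illustrated in software. The following is only a minimal sketch under stated assumptions, not the claimed hardware: the function names `direct_convolution` and `segmented_convolution` and the segment length parameter `m` are hypothetical, and ordinary nested loops stand in for the parallel multiplier and adder units. It shows that accumulating partial results per segment at their associated output positions gives the same result as a direct convolution.

```python
def direct_convolution(x, h):
    """Reference result: y[t] = sum over v of h[v] * x[t - v]."""
    y = [0] * (len(x) + len(h) - 1)
    for t, xt in enumerate(x):
        for v, hv in enumerate(h):
            y[t + v] += hv * xt
    return y


def segmented_convolution(x, h, m):
    """Convolve x with h, handling h in segments of m coefficients.

    Each block of m input samples is combined with every segment of h,
    and each partial result is added at its associated position in an
    output buffer that starts out as all zeros.
    """
    segments = [h[s:s + m] for s in range(0, len(h), m)]
    out = [0] * (len(x) + len(h) - 1)              # output buffer
    for block_start in range(0, len(x), m):        # one set of input samples
        block = x[block_start:block_start + m]
        for seg_index, segment in enumerate(segments):  # reuse the same "hardware"
            offset = block_start + seg_index * m
            for a, xa in enumerate(block):
                for b, hb in enumerate(segment):
                    out[offset + a + b] += xa * hb
    return out


x = [3, -1, 4, 1, -5, 9, 2, -6, 5, 3]
h = [2, 7, -1, 8, 2, -8, 1]
assert segmented_convolution(x, h, m=3) == direct_convolution(x, h)
```

Because only integer multiplications and additions are involved, the segmented and direct results agree exactly, whatever segment length is chosen.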

Additional advantages and features of the invention are evident from
the following description, drawings and dependent claims.

BRIEF DESCRIPTION OF THE DRAWINGS 

The invention will now be described in more detail with the aid of
exemplary embodiments and with reference to the accompanying drawings, in
which

Fig. 1 schematically shows an implementation for discrete convolution
in the time plane,

Fig. 2 schematically shows one embodiment of an implementation for
discrete convolution in the time plane according to the invention,

Fig. 3 shows how registers in the embodiment according to Fig. 2
co-operate during different phases of the convolution,

Fig. 4 shows how contents of registers change during different
phases of the convolution,

Fig. 5 schematically shows an implementation of an output buffer
used in the embodiment according to Fig. 2, and

Fig. 6 shows the function of a calculating unit in the embodiment
according to Fig. 2.
DESCRIPTION 

A discrete convolution in the time plane may take place according to
Formula 2 below. A practical example of implementation is shown in Fig. 1.

$$y(t) = \sum_{v} h(v)\,x(t-v) \qquad (2)$$

wherein

x constitutes input,
h constitutes a convolution kernel (filter) and
y constitutes output.

The basic operations in convolution may be very efficiently parallelised.
Input samples are fed into a first shift register 10. A corresponding first filter
register 18 holds the impulse response. Sound samples and impulse response
coefficients are stored in registers 10 and 18, in which each value has its own
direct output. For each pair of sound sample and filter coefficient a separate
unit for multiplication is provided, i.e. all multiplications are performed in
parallel. Fig. 1 shows all the units for multiplication combined in a multiplier
unit 12. All of the results of these multiplications are then to be added, and
this may also be performed in a single step in an adder unit 13.
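
The arrangement of Fig. 1 can be modelled in a few lines of software. This is a sketch only: the name `fir_stream` is hypothetical, a `deque` stands in for the shift register 10, a plain list for the filter register 18, and the list comprehension and `sum` stand in for the multiplier unit 12 and adder unit 13, which in hardware operate in parallel rather than sequentially.

```python
from collections import deque


def fir_stream(samples, coefficients):
    """Produce one output per input sample, as in a shift-register filter."""
    # Shift register, initially empty (all zeros), same length as the filter.
    register = deque([0] * len(coefficients), maxlen=len(coefficients))
    outputs = []
    for sample in samples:
        register.appendleft(sample)   # newest sample first; the oldest drops out
        # One product per register cell (performed in parallel in hardware).
        products = [h * x for h, x in zip(coefficients, register)]
        outputs.append(sum(products))  # adder unit
    return outputs


# Feeding a unit impulse reproduces the filter coefficients.
assert fir_stream([1, 0, 0, 0], [5, 6, 7]) == [5, 6, 7, 0]
```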

An efficient manner of organising the addition is to add the included
numbers in pairs first. This gives half as many numbers, and these may
then be added in pairs in a similar manner. The result is one single
number after a few steps. Thus, a so-called adder tree is used. Because
addition is commutative and associative (the order is immaterial), this gives
exactly the same result as if the original numbers were added in parallel or in
order. As the additions in pairs in each step are performed in parallel, each
step takes the same time as a single addition, i.e. the total calculation time
for the whole adder tree is proportional to log2(N) instead of N, wherein N is
the total number of included numbers.
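
As an illustration of the adder tree, the sketch below sums a list by repeated pairwise addition and counts the number of levels. The function name `adder_tree` is hypothetical, and padding an odd level with a zero is an assumption made here for simplicity; hardware could equally pass the odd term straight through to the next level.

```python
def adder_tree(values):
    """Sum values by pairwise addition; return (total, number_of_levels)."""
    level = list(values)
    levels = 0
    while len(level) > 1:
        if len(level) % 2:        # pad an odd level with a neutral zero
            level.append(0)
        # All pair additions of one level happen in parallel in hardware.
        level = [level[i] + level[i + 1] for i in range(0, len(level), 2)]
        levels += 1
    return (level[0] if level else 0), levels


total, depth = adder_tree(range(100))
assert total == sum(range(100))
assert depth == 7   # 100 terms need ceil(log2(100)) = 7 levels
```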

Without further measures, the delay in the total calculation chain may still
become too long in proportion to a certain clock cycle time. This problem
may be solved by placing a number of registers along the path and dividing
the calculation over a number of clock cycles (pipelining).

It is possible to perform the multiplication in a single (combinational) step.
This will limit the clock frequency, and it may be sufficient to place one
register before and one after the multiplier, and one inside and one after the
adder tree. If an even faster solution is desired, the multiplication must also be
divided into more pipeline steps.

In the following discussion the starting point is an example having a
data transfer rate of 50 kHz for the sound, a clock frequency of 50 MHz for
the digital electronics (1000 times faster), a reverberation time of 2 seconds
and a 16 bit data width for both sound and impulse response. An impulse
response of 100,000 samples is used. By using such a high clock frequency
it becomes possible to divide the problem into segments and time
multiplex the hardware, i.e. make the hardware one thousandth as wide and
instead use it 1000 times. The narrow version of the hardware still has
exactly the same structure as the original version, with the adder tree and
pipelining.
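
The sizing in this example can be checked with a few lines of arithmetic. The variable names below are illustrative only, and the throughput comparison is made for a single impulse response.

```python
sample_rate_hz = 50_000             # data transfer rate of the sound
clock_rate_hz = 50_000_000          # clock frequency of the electronics
impulse_response_length = 100_000   # 2 seconds at 50 kHz

# Clock cycles available per incoming sample, i.e. the multiplexing factor.
cycles_per_sample = clock_rate_hz // sample_rate_hz

# Width of the time-multiplexed hardware and number of segments to cycle through.
segment_length = impulse_response_length // cycles_per_sample
segment_count = impulse_response_length // segment_length

assert cycles_per_sample == 1000
assert segment_length == 100
assert segment_count == 1000

# The narrow hardware used every clock cycle delivers the same number of
# multiplications per second as full-width hardware used once per sample.
assert segment_length * clock_rate_hz == impulse_response_length * sample_rate_hz
```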

A practical embodiment of a convolution unit according to the invention
is schematically shown in Fig. 2. The input is introduced through an input
buffer 14, suitably arranged as a shift register. The input buffer 14 is
connected to a first register 10 through a latch 15. The first register 10 is
operatively connected to a second register 11. The first register 10 and the
second register 11 are suitably arranged as shift registers. The first register 10
and the second register 11 are fed back through a feedback loop 25 and the
latch 15, so that it is possible to shift data back and forth between the
registers in a circulating process. The process is described in detail below.

A memory 16 is arranged for storing the impulse response. The impulse
response is divided into segments and operates as a filter on the input
signal. In the embodiment shown, the memory 16 is designed for storing
1000 segments, each of 100 samples, or terms. A segment of the impulse
response is processed together with a corresponding segment of input. A
first calculation step takes place in a multiplier unit 12. The multiplier unit
12 comprises a plurality of multiplication means for parallel multiplication of
specific samples from the input signal and from the impulse response. The
segment of the impulse response to be processed is transferred through a
multiplexer 17 to a first filter register 18. The memory 16 is also connected to
a second filter register 19 through the multiplexer 17.

The embodiment shown in Fig. 2 is particularly suitable for pipelining.
Accordingly, the included components, e.g. registers and calculation units,
comprise a plurality of logical blocks, wherein each block performs a part of
an operation. Several operations in series are thereby performed apparently
at the same time.

One filter register is sufficient if the head of the listener is completely
still. If the listener turns the head, the same echogram may be used, but
new impulse responses must be calculated. This may be performed on a
modern PC in a few tens of milliseconds, fast enough to create an apparently
continuously active sound image following the head turning in real time.

A correct reproduction may also be provided when the head is turning
by doubling the memory for impulse responses in a similar manner as the
corresponding buffers in the convolution unit. While one memory is used for
convolution the other is filled with new contents. Alternation between the
memories may take place momentarily.

Accordingly, while data from the first filter register 18 is processed,
new filter data from the impulse response may be transferred to the second
filter register 19. Data is alternately used and loaded in the two filter
registers 18 and 19, so that the processing may take place without any delay for
loading of registers.

In the embodiment according to Fig. 2, and with a suitable clock
frequency of the oscillator controlling the electronics, all hardware will be used
all the time in continuous operation. One way to make the solution more
efficient is to adapt the different calculation units so that no unnecessary
bits are used in any case. In this manner it is possible to increase
the rate and decrease the amount of hardware.

Each cell of the second shift register 11 and of the filter registers is
connected to a specific multiplication means of the multiplier 12, so that the
multiplication may take place in parallel. The result from each multiplication is
32 bits long if the factors included have 16 bits. However, normally
only the 11 most significant bits of the result need to be used. It may
be even more efficient to adapt the multiplication means so that they only
calculate the bits needed. Then, 11+11 bits are used in a first step of the
adder unit 13, which in turn gives a result of 12 bits. Consequently the number
of bits is increased by one for each step in the adder tree. Depending on
the number of steps and the number of segments, only as many bits as
needed are included in the final result.
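
The growth of the word length through the adder tree can be worked out as follows. This is a sketch under the assumption that each tree level adds exactly one bit and that the tree has ceil(log2(N)) levels for N products; the function name `adder_tree_output_bits` is hypothetical, and the extra bits needed when accumulating over segments in the output buffer are not modelled.

```python
def adder_tree_output_bits(input_bits, term_count):
    """Word length at the root of an adder tree with term_count inputs."""
    # (term_count - 1).bit_length() equals ceil(log2(term_count)) for term_count >= 1.
    levels = (term_count - 1).bit_length()
    return input_bits + levels


assert adder_tree_output_bits(11, 2) == 12     # first step: 11 + 11 bits give 12 bits
assert adder_tree_output_bits(11, 100) == 18   # 100 products: 7 levels, 18 bits
```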

An output 20 of the adder unit 13 is operatively connected to a calculation
unit 21. A control unit 22 is operatively connected to the calculation unit
21 and an output buffer 23. The control unit 22 ensures that the partial
results from the convolution operations available on the output 20 are added
to an associated previously calculated partial result stored in the output
buffer 23. Suitably the control unit 22 also controls the remaining components,
e.g. the shift registers, the calculation units and the multiplexer. Both
the multiplier unit 12 and the adder unit 13 are suitably arranged for
pipelining.

The function of the circuit in Fig. 2 may be described schematically in
the following way. Sound samples are shifted into the registers, and the
different parts of the impulse response are stored in the memory 16, so as to
be loaded into one of the filter registers, one part at a time. The filter register
is doubled, one being loaded while the other is used for convolution, and then
the operation is alternated (the actual change takes place between two clock
cycles and does not take any time). Then, as output is no longer produced at
the correct rate, an output buffer 23 in the form of a memory having a
particular calculation unit is introduced, see the description of Fig. 6 below.
While the filter segment is stored anyway, convolution operations a
few points ahead in time (already registered) are performed. This takes
place by means of a three-part input buffer comprising the input buffer 14
and the shift registers 10 and 11, in which input data are "swung" back and
forth in both the shift registers while new samples are read into the input
buffer 14. The results are then superimposed in the correct positions in the
output buffer 23 and are eventually fed out at the correct rate.

In connection with the handling of head turning, which is schematically
described above, a minor modification of the control of the output
buffer is also required, so that the buffer is set to zero only at start-up with a
new sound and not when new impulse responses are loaded. In this manner
a "sliding transition" between filters for two discrete directions is automatically
obtained. It is not necessary to change filters more often than what
corresponds to about half of the reverberation time, i.e. once per second in the
above described example.

Figs. 3A-3D show how an input may be used in an efficient manner. In
a standard position, which is shown in Fig. 3A, new input samples are shifted
in through an input 24 of the input buffer 14. The content of the input buffer
14 is then shifted further into the first shift register 10 and the second shift
register 11. During the convolution operations the input buffer 14, the first
shift register 10 and the second shift register 11 will contain different
generations of input data.

During the convolution operations the shift registers are connected to
each other in the manner shown in Fig. 3B. The input buffer 14 is separated
from the shift registers and is not allowed to change any register content.
However, incoming input samples continue to be shifted into the input buffer
14 at the rate they arrive.

While one segment of the impulse response is read into one of the
filter registers, e.g. the first filter register 18, input data of the first shift register
10 and the second shift register 11 are "swung" back and forth until all
samples are combined with the samples of the impulse response in the first filter
register 18. The partial results originating in each position are added to the




associated positions of the output buffer through the calculation unit 21 and
are eventually fed out at the correct rate. With a solution adapted to the
example, all buffers/registers are 100 positions, or samples, long and 16 bits
wide, and the memory 16 for the impulse response contains 1000 segments.
Accordingly, the convolution unit is occupied every clock cycle when in
running operation.

In the state shown in Fig. 3C all samples from a segment of the input
signal are used. Then the input buffer 14 contains a completely new set, or
generation, of input data, designated G(n). The first shift register 10 contains
the previously used data corresponding to a generation G(n-1) and the
second shift register 11 contains even earlier used data, corresponding to a
generation G(n-2).

Then, switching into the state shown in Fig. 3D takes place, in
which the content of the input buffer 14 has been transferred to the first shift
register 10. This is possible since the latch 15 of the control unit has received
instructions to open for the transfer. Then, the first shift register 10 contains
the generation G(n) data. The input buffer is simultaneously set to zero and
prepared for introduction of new input samples. At the introduction of new
data from the input buffer 14 into the first shift register 10 the previously used
data is simultaneously shifted from the first shift register 10 to the second
shift register 11, which then contains the data of generation G(n-1). Data of
the second shift register 11 is not fed into the first register 10, since the
feedback loop 25 between the second shift register 11 and the first shift
register 10 is broken. This may be accomplished by having the latch 15
break the connection.

In the state shown in Fig. 3E the latch 15 has broken the connection
between the input buffer 14 and the first shift register 10 again, while the
feedback loop 25 is closed once again. A new series of convolution
operations may be initiated and a new generation of input G(n+1) is accumulated in
the input buffer 14.
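
The hand-over of generations between the input buffer 14 and the two shift registers can be summarised in a small model. This is a sketch only: the function name `transfer_generation` is hypothetical, Python lists stand in for the registers, and the sample-by-sample shifting is collapsed into a single step.

```python
def transfer_generation(input_buffer, first_register, second_register):
    """Model the transfer of Fig. 3D.

    The input buffer content G(n) moves into the first register, the previous
    content of the first register G(n-1) moves into the second register, and
    the oldest generation G(n-2) is discarded because the feedback loop is open.
    """
    new_second = list(first_register)
    new_first = list(input_buffer)
    new_input = [0] * len(input_buffer)   # input buffer cleared for G(n+1)
    return new_input, new_first, new_second


g_n, g_n1, g_n2 = [7, 8, 9], [4, 5, 6], [1, 2, 3]
inp, reg10, reg11 = transfer_generation(g_n, g_n1, g_n2)
assert (inp, reg10, reg11) == ([0, 0, 0], [7, 8, 9], [4, 5, 6])
```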

Each convolution results in one output point, which has been calculated
based on a specific position of the buffer circulating back and forth and a
specific segment of the impulse response. To indicate the position in the
data buffer, the starting point is that 100 new data have just been shifted in, see
Fig. 4. This position is called B(0). The next clock pulse gives the position B(1),
etc. until B(99). Data is first shifted to the right through the registers until all
data from the first register have been shifted into the second register 11. In
each position a convolution operation takes place. After that the same
data is used once again by shifting data in the reverse direction. Due to the
"swinging" the next position is B(98). Hence, the very first convolution is
derived from I(0) and B(0). This value is referred to as I(0)B(0). The next
value produced is then I(0)B(1) etc. Then, after I(0)B(99) comes I(1)B(98).

When all the segments of the impulse response have been processed,
100 new samples are shifted in and the process starts over from the
beginning with I(0)B(0). This value is to be added to the previously stored I(1)B(0).
Consequently a designation for the point of time of the input must be
introduced to separate the convolution results. The time for the first input buffer is
called T(0) etc. With this addition the value for the new output point (no. 100)
will be T(0)I(1)B(0) + T(1)I(0)B(0) after updating. Due to the "swinging" the
time progression becomes complicated. Comprehension of the calculations
may be facilitated by focusing on the result in the output buffer 23.



The following designations are introduced:

m       the number of elements in the convolution register,
n       the number of segments in the impulse response memory,
O(j)    element in the output buffer.



In a so called "steady state", i.e. when the convolution has been in 
progress for some time, the following Formula 3 applies: 



n-\ 



OU) = Yn P + i)I<ip + i)B(q) 



(3) 
wherein 

p = 0, 1, ...

q = 0, 1, ..., m-1

j=p + q 

When the convolution starts, all values in the output buffer 23 are
set to zero. As appears from the above table, it takes a certain time before all
the terms for updating each output point are available. More specifically, it
will take exactly as much time as the impulse response is long.

In practice the output buffer 23 may be designed as a regular RAM
memory, which is organised as a ring buffer in accordance with Fig. 5. The
ring buffer comprises one address pointer 26 for start and one address pointer
27 for end. Between start and end the buffer is set to zero. A control unit
generates the required addresses so that updating at each moment takes
place in the correct position and so that the outputting of data takes place in the
correct manner. At the same time as a data point is fed out it is set to zero in
the ring buffer and is thereby prepared for eventually becoming the
first value in the buffer again.
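
A ring buffer of this kind may be sketched as follows. The class and method names are hypothetical, and the model is simplified: a single start pointer plus a relative offset is used instead of the two address pointers 26 and 27, and the address generation of the control unit is reduced to a modulo operation.

```python
class RingOutputBuffer:
    """Accumulating output buffer organised as a ring."""

    def __init__(self, size):
        self.data = [0] * size   # all positions zero at start
        self.start = 0           # address of the next sample to be fed out

    def accumulate(self, offset, partial_result):
        """Add a partial result to the cell `offset` positions after start."""
        if not 0 <= offset < len(self.data):
            raise IndexError("offset outside the buffer")
        self.data[(self.start + offset) % len(self.data)] += partial_result

    def feed_out(self):
        """Return the oldest output sample and zero its cell for reuse."""
        value = self.data[self.start]
        self.data[self.start] = 0
        self.start = (self.start + 1) % len(self.data)
        return value


buf = RingOutputBuffer(4)
buf.accumulate(0, 5)
buf.accumulate(0, 2)        # a second partial result for the same output point
buf.accumulate(3, 9)
assert buf.feed_out() == 7  # 5 + 2; the cell is now zero and free for reuse
buf.accumulate(3, 1)        # wraps around into the cell that was just freed
assert [buf.feed_out() for _ in range(4)] == [0, 0, 9, 1]
```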

If the convolution unit produces one result every clock cycle, it will be
necessary to read out an old value, add a new value and write back the
result to the output buffer during one clock cycle. For example, this may be
solved by designing a RAM circuit so that two values may be reached at a
time. Consequently a calculation unit, which is reading, accumulating and
writing, must be present between the convolution unit and the memory.
Four additions are performed during a process step. As the four additions
are not evenly distributed between the four clock cycles, at least one
buffer register must be present for intermediate storing of the convolution
results from one clock cycle to another. A practical embodiment of such a
calculation unit 21 is disclosed in Fig. 6. The calculation unit 21 comprises
two parallel pipelines 28 and 29. During four clock cycles the two parallel
pipelines 28 and 29 are reading and writing, respectively, in two and adding
in four.

In clock cycle 1 there is a transfer of values from the output memory to
the buffer register #1 and the buffer register #2.

In clock cycle 2 a new result value from the convolution unit is added
to the value in the buffer register #1. Further, a result value from the
convolution unit is added to the value in the buffer register #2. There is also a
transfer of values from the output memory to buffer register #3 and buffer register
#4.

In clock cycle 3 the values in the buffer register #1 and buffer register
#2 are transferred to the output buffer 23. A new result value from the
convolution unit is simultaneously added to the value in the buffer register #3 and a
corresponding addition to the value in the buffer register #4 is made.

Finally, in clock cycle 4 the values in the buffer register #3 and the
buffer register #4 are transferred to the output buffer 23. The common
control unit 22 is processing these calculations and addresses the memory.
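
The four-cycle schedule can be traced in software. The sketch below is illustrative only: the function name `process_step` and its parameters are hypothetical, the four addresses are assumed to be distinct (otherwise the read in cycle 2 could see a value not yet written back in cycle 3), and operations that are simultaneous in hardware appear here as consecutive statements.

```python
def process_step(output_memory, addresses, results):
    """Accumulate four convolution results into the output memory."""
    a1, a2, a3, a4 = addresses
    r1, r2, r3, r4 = results

    # Cycle 1: read two old values into buffer registers #1 and #2.
    reg1, reg2 = output_memory[a1], output_memory[a2]

    # Cycle 2: add new results to #1 and #2; read old values into #3 and #4.
    reg1, reg2 = reg1 + r1, reg2 + r2
    reg3, reg4 = output_memory[a3], output_memory[a4]

    # Cycle 3: write back #1 and #2; add new results to #3 and #4.
    output_memory[a1], output_memory[a2] = reg1, reg2
    reg3, reg4 = reg3 + r3, reg4 + r4

    # Cycle 4: write back #3 and #4.
    output_memory[a3], output_memory[a4] = reg3, reg4
    return output_memory


memory = [10, 20, 30, 40, 50]
process_step(memory, addresses=(0, 1, 2, 3), results=(1, 2, 3, 4))
assert memory == [11, 22, 33, 44, 50]
```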

In the above described solution it is assumed that the head of the
listener is completely still. If the listener turns the head, the same echogram
may be used, but new impulse responses must be calculated. This may be
accomplished in a few tens of milliseconds using a modern PC, which is fast
enough to create an apparently continuously active sound image following
the head turning in real time.

A possible extension to handle head turning is to double the memory
16 for impulse responses in a similar manner as the corresponding filter
registers 18 and 19 in the convolution unit. While one memory is used for
convolution the other is loaded with new content. Alternation between the
memories may take place momentarily, so that the convolution does not
need to be interrupted. A minor modification of the control of the output
buffer is required so that the buffer is set to zero only at start with a new
sound and not when new impulse responses are loaded. By this a "sliding
transition" between filters for two discrete directions is obtained. It is not
necessary to change filters more often than that corresponding to about half of
the reverberation time, i.e. once per second in the described example.
CLAIMS 

1. A method for convolution of a digital input signal, characterised
by

storing digital signals corresponding to an impulse response for
at least an instance of a spatial environment,

continuously storing samples of said digital input signal,

delimiting a set of samples, limited in time, of said digital input
signal from continuously incoming samples,

dividing the digital signals of said impulse response into a plurality
of segments consisting of coefficients,

repeatedly performing convolution operations to a segment of
the impulse response and the set of samples, limited in time, of the digital
input signal until all segments of the impulse response have been processed,

performing said convolution operations by a parallel multiplication
of the coefficients of the impulse response and the set of samples, limited
in time, of the digital input signal and by repeated addition of the products
in pairs forming partial results, and

adding partial results to previously calculated associated partial
results forming an output signal sample, whereby the convolution is
performed in the time plane.

2. A method according to claim 1, wherein the convolution operations are
performed by pipelining.

3. A method according to claim 1, wherein the number of coefficients in a
segment of the impulse response is equal to the number of samples in the
set of samples, limited in time, of the digital input signal.

4. A method according to claim 1, wherein convolution operations are
repeatedly performed with a first segment of the impulse response, a second
segment of the impulse response being prepared for the convolution
operations simultaneously.

5. A method according to claim 1, wherein the partial results from the
convolution operations are stored until all coefficients in all segments of the
impulse response have undergone convolution operations.



6. A device for convolution of a digital input signal, characterised in

that a memory (16) for storing digital signals corresponding to an
     impulse response for at least an instance of a space environment,
     is operatively connected to at least a first filter register (18),
that the memory (16) and the first filter register (18) are divided into a
     plurality of segments, each of which segments having a plurality
     of cells, wherein each cell may contain a coefficient of the digital
     signals of the impulse response,
that an input buffer (14) is arranged for storing samples of the digital
     input signal,
that the input buffer (14) is operatively connected to memory means
     (10, 11),
that the memory means (10, 11) is designed for gradually shifting
     data back and forth between cells of the memory means (10,
     11),
that each cell of the first filter register (18) is connected to a first input
     terminal of a multiplier unit (12) comprising a plurality of
     multiplication means, and cells of the memory means (10, 11) are
     connected to a second input terminal of the multiplication unit (12)
     for parallel multiplication of the contents of the cells,
that an output of each multiplication means is connected to input
     terminals of an adder unit (13) for addition of the products from
     the multiplications to a partial result,
that an output of the addition unit (13) is operatively connected to an
     output buffer (23) through a calculation unit (21),
that the calculation unit (21) comprises addition elements, and
that a control unit (22) is operatively connected to the calculation unit
     (21) and the output buffer (23), for controlling the calculation unit
     (21) to add new partial results to previously calculated
     associated partial results forming an output signal sample.



7. A device according to claim 6, wherein the input buffer (14) is operatively
connected to a first shift register (10) for transferring a set of samples, limited
in time, of the digital input signal, and the first shift register (10) is operatively
connected to a second shift register (11) for gradually shifting data back and
forth between cells of the first shift register (10) and the second shift register
(11).

8. A device according to claim 6, wherein the multiplier unit (12) is designed
for working with pipelining.

9. A device according to claim 6, wherein the adder unit (13) is designed for
working with pipelining.

10. A device according to claim 6, wherein the memory (16) is operatively
connected to a second filter register (19) for storing coefficients therein, and
wherein the first filter register (18) and the second filter register (19) are
alternately operatively connected to the multiplier unit (12) and the memory
(16), respectively.

11. A device according to claim 6, wherein the memory (16) is doubled to
allow convolution without interruptions when changing the impulse response.